The number of patients with diseases that cause severe movement disabilities is
noticeably increasing. These disabilities leave patients unable to carry out
their daily activities or interact with their external environment. However,
human-computer interfaces (HCI) give those patients new hope of interacting
once again. HCI enables these patients to communicate with their environment by
recognizing the movement of their eyes. Eye movements are recorded by an
electro-oculogram (EOG) through electrodes placed vertically and horizontally
around the eyes. In this paper, vertical and horizontal EOG signals are analyzed
to detect six eye movements (up, down, right, left, double blinking, and
center). Three deep learning models, namely convolutional neural network (CNN),
visual geometry group (VGG), and inception, are examined on filtered EOG
signals. The experimental results reveal the superiority of the inception model,
which provides the best average accuracy of 96.4%. Accordingly, a writing system
is presented based on the detected movements.


1. INTRODUCTION

Many diseases around the world cause temporary or permanent paralysis even though the cognitive
parts of the patient's brain remain intact. These diseases weaken the motor neurons, which leads to the
loss of voluntary muscle control. Thus, the limbs are unable to carry out their vital functions [1], [2].
Unfortunately, the number of patients is increasing. As stated by the World Health Organization, there are
more than one billion people with disabilities all over the world, and they constitute approximately 15% of
the world's population (i.e., one person out of seven). This is due to many factors, including genetic factors,
diet, and environmental factors [3].

One of the best known of these diseases is amyotrophic lateral sclerosis (ALS). ALS is a motor
nerve disease that causes atrophy in the muscles of the body and ultimately leads to paralysis: the motor
nerves of the upper and lower extremities weaken and gradually stop sending nerve signals to the
extremities. Consequently, the limbs stop performing their tasks and the patient loses control over them.
However, some muscles survive this atrophy, including the muscles that support eye movement [4].

This disease usually begins around the age of 50. ALS affects five out of every 100,000 people
worldwide. No cause is known so far, apart from a family history of the disease. Clear signs and symptoms
distinguish it: first, the weakness of the motor nerves leads to a slight involuntary tremor in the fingers,
hands, and feet; it then progresses to changes in the voice and drooling; and in the final stage it leads to
difficulty in breathing and death. Thus, the lives of these patients remain difficult and painful. They cannot
carry out the tasks of normal life, and they cannot communicate with their external environment except
through the movement of their eyes [5].

Many other diseases also lead to atrophy and death of the motor nerves, such as myasthenia
gravis, in which the muscles are damaged and unable to receive nerve signals, as well as Guillain-Barré
syndrome (GBS) and others. The causes of these diseases include environmental conditions, such as
constant exposure to some heavy metals or electromagnetic waves, diets that lack basic elements, harmful
habits such as smoking, and genetic factors. However, the result remains the same: patients are unable to
live normally due to disability [6], [7]. Nevertheless, these patients still have one last tool for
communication, the movement of their eyes, which can drive interfaces that help them to act and express
themselves.

In the past few decades, human-computer interfaces/human-machine interfaces (HCI/HMI) have
emerged; the field is also known as human-machine interaction. They are designed and implemented to be
the link between the human being and the computer: the individual takes actions by selecting from options
displayed on the computer screen, and the effect of these actions can also be seen on the screen or machine.
Many efficient human-computer interfaces have been developed, such as electric wheelchair control and
mobile robot control [8], [9]. Such interfaces can help these patients to overcome their disabilities by
providing a means of communicating with their external environment: the directions of eye movements are
identified and used to select a specific action or to write a text. These interfaces depend on four main
directions (top, bottom, right, and left), with a blink to confirm the choice, as in most studies, or on
additional sub-directions (top left, top right, bottom left, and bottom right), as in a few studies [10], [11].

Eye movement directions can be determined using an electro-oculogram (EOG), which measures the
signals produced by the electrical potential difference between the cornea and the retina. When the eye
moves in different directions, positive pulses form on the cornea at the front and negative pulses on the
retina at the back, so the eyeball behaves as a dipole with a positive cornea and a negative retina [12]. The
amplitude of the pulse increases with the rolling angle, and the width of the positive (negative) pulse is
proportional to the duration of the eyeball rolling process [13], as shown in Figure 1. EOG is usually
recorded using five electrodes placed around the eyes: a pair is placed above and below the left eye to
measure vertical movement, a pair is placed to the right and left of the eye to measure horizontal movement,
and the last one is placed on the center of the forehead as the ground.


[Figure 1 panels: (1) looking straight ahead, (2) rolling eyes upward, (3) rolling eyes downward; original EOG waveform, amplitude of EOG signals (V) versus time (s)]


Figure 1. The waveform of eye upward and downward movements


In this paper, the EOG signals have been acquired using the PSL-iEOG2 device. This device is
dedicated to recording EOG signals through electrodes placed around the eye vertically and horizontally.
The EOG signals are then analyzed, and six different eye movements are determined (up, down, right, left,
double blink, and center). A dataset of 500 pairs of EOG signals for each movement has been collected.
Each pair of EOG signals (horizontal and vertical) is preprocessed and concatenated to form the input for
the examined models. Three models are examined in this study, namely convolutional neural network
(CNN), visual geometry group (VGG), and inception. The VGG and inception architectures are modified to
fit the task at hand. Furthermore, a user interface is proposed to ease the daily life of paralyzed people. The
remainder of this paper is organized as follows: section 2 discusses the existing studies; section 3 presents
the proposed method; the achieved results are provided in section 4; and finally the conclusion is given in
section 5.


Development of electrooculogram based human computer interface system using ... (Radwa Reda Hossieny) 


2412 O ISSN: 2302-9285 


2. LITERATURE REVIEW 

In recent years, many studies have considered the development of HCI systems based on EOG
signals [14]-[20]. According to the techniques utilized, these studies can be divided into two main
categories: i) studies that use simple techniques such as thresholding and simple rules and ii) studies that
use sophisticated classifiers, such as support vector machine (SVM), linear discriminant analysis (LDA),
decision tree, and artificial neural networks (ANN). Key studies are summarized as follows:

Heo et al. [14] proposed a new practical electrode position on the forehead to measure EOG
signals. A low pass filter with a cutoff of 10 Hz is used to suppress noise. The maximum peak and minimum
valley values and their positions are captured as features. These features are classified using specific
thresholds. If the amplitude crosses the high threshold, 300 samples centered on the peak are extracted and
the max value, min value, max position, and min position are determined; if the difference between the max
position and the min position is greater (less) than zero, the movement is right (left). The vertical
movements are classified in the same way. Six types of eye movements are classified (up, down, left, right,
blink, and double blink) by this algorithm. The upper and lower thresholds of each channel are optimized
according to individual EOG characteristics. The cursor is initially placed over the letter “E”, which is at
the center of the virtual keyboard, and moves step by step according to the user's eye movements. A
character is selected by a double blink, after which the cursor automatically returns to “E”. This writing
system has achieved an accuracy of 91%.
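
The threshold rule described for the horizontal channel can be sketched as follows. This is only an illustration of the rule as summarized above, not the implementation of [14]; the function name `classify_horizontal`, the handling of the window at the signal edges, and the return values are assumptions.

```python
def classify_horizontal(signal, high_threshold, window=300):
    """Return 'right', 'left', or None for one horizontal EOG recording.

    Sketch of the rule summarized in the text; names and edge handling
    are illustrative assumptions.
    """
    # Locate the largest sample and check it against the high threshold.
    peak = max(range(len(signal)), key=signal.__getitem__)
    if signal[peak] <= high_threshold:
        return None  # amplitude never crosses the threshold

    # Extract a window of `window` samples centered on the peak
    # (clipped at the start of the signal).
    start = max(0, peak - window // 2)
    segment = signal[start:start + window]

    # Positions of the maximum peak and the minimum valley in the window.
    max_pos = max(range(len(segment)), key=segment.__getitem__)
    min_pos = min(range(len(segment)), key=segment.__getitem__)

    # Sign of (max position - min position) decides the direction.
    diff = max_pos - min_pos
    if diff > 0:
        return "right"
    if diff < 0:
        return "left"
    return None
```

The same function could be applied to the vertical channel with its own thresholds, mapping the two outcomes to the vertical directions.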

Ang et al. [15] designed a user-friendly HCI system. Only the EOG produced by one eye
movement (double blink) is used to encode the user's intentions and to control and guide an automatically
moving cursor on screen. In the preprocessing stage, wavelet filtering is used instead of traditional band
pass filtering to remove noise from the raw recordings. The features extracted from the EOG signal are the
L1-norm, entropy, and kurtosis. The L1-norm measures the signal's magnitude by summing the absolute
values of all samples in the vector, entropy measures the amount of information in the signal, and kurtosis
measures the peakedness of the signal. Subsequently, the feature vectors are fed into an SVM classifier. The
experiments were carried out on eight subjects in both indoor and outdoor conditions. The average accuracy
achieved is 84.42% for indoor and 71.50% for outdoor conditions.
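
As a rough illustration of these three features, the sketch below computes them for a list of samples. The exact definitions used in [15] are not given here, so the choices are assumptions: kurtosis is taken as the fourth standardized moment (without subtracting 3), and entropy as the Shannon entropy of an amplitude histogram with a hypothetical number of bins.

```python
import math


def l1_norm(x):
    """Sum of the absolute values of all samples."""
    return sum(abs(v) for v in x)


def kurtosis(x):
    """Fourth standardized moment (assumed definition, no -3 offset)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)


def histogram_entropy(x, bins=8):
    """Shannon entropy (bits) of an amplitude histogram (assumed definition)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant signals
    counts = [0] * bins
    for v in x:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)
```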

Qi and Alias [16] collected EOG signals from five subjects. The EOG signals are preprocessed to
remove noise and other interference using a 4th-order Chebyshev band pass filter with a frequency range of
0.1-50 Hz. Thereafter, three types of features are extracted: autoregressive coefficients using the Burg
method, statistical parameters such as kurtosis coefficients and interquartile range, and power spectral
density using the Yule-Walker method. For classification, ANN and SVM have been examined; the latter
achieved the best accuracy of 69.75%.

It is noticeable that there is a trade-off between accuracy and processing time. Moreover, to the
best of the authors' knowledge, most of the existing studies have utilized traditional classifiers, which can
be explained by the limited public data resources. Hence, in this study, we have collected our own dataset of
500 pairs of EOG signals for each eye movement, and we aim to improve the recognition accuracy without
affecting the processing time.


3. THE PROPOSED METHOD 

The proposed system architecture for HCI includes four stages: data acquisition (DAQ),
preprocessing, classification, and user interface, as shown in Figure 2. The first stage, data acquisition,
collects the digital data. The second stage, preprocessing, filters noise out of the data and improves its
quality. In the classification stage, the eye direction of the test data is determined. Finally, the user interface
stage integrates all system functionality and presents it to users. In the following subsections, each stage is
described in detail.


Figure 2. The proposed architecture for the EOG based HCI 


3.1. Data acquisition

Modern DAQ systems consist of three essential components: sensors, DAQ measurement hardware,
and a computer with a software application. A sensor is a device that captures the change in a physical
phenomenon and converts it into a measurable analog signal such as a voltage. A DAQ device cleans the
noisy signals using a signal conditioning circuit and digitizes the analog signals into a stream of digital data
using analog-to-digital converters (ADCs). The software can visualize, analyze, and store the digitized data
on the computer in different formats. Due to the lack of reliable public EOG datasets, the data in this study
have been acquired and collected using special hardware that can measure EOG.


3.1.1. Hardware device 

The sensors are silver/silver chloride (Ag/AgCl) electrodes that are placed around the eye,
horizontally and vertically. They are standard pre-gelled, self-adhesive disposable electrodes of the type
most commonly used as reference electrodes in electrochemical measurements for control systems. These
electrodes outperform others in retaining an essentially constant composition.

The device includes two units: PSL-iEOG2 and PSL-DAQ (Figure 3). PSL-iEOG2 is a two-channel
EOG module that produces an analog EOG signal and EOG direction events that represent the actions in the
signals. It uses a DC 5 V input supply with 750 V/V amplification. PSL-DAQ is designed to receive
two-channel analog signals and digitize them for reading, analysis, and processing. It runs at a sampling
rate of 1,000 samples per second. The signal range is from 0 to 3.3 V with a center of 1.65 V. The hardware
comes with software to monitor and store the signals, as well as LabVIEW and Visual C++ libraries, with
16-bit resolution [21].
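
To give a feel for these numbers, the following sketch converts a raw ADC code to volts and then to an input-referred EOG amplitude. It assumes, purely for illustration, that the 16-bit codes map linearly onto the 0 to 3.3 V range and that the whole 750 V/V gain sits between the electrodes and the ADC; the device documentation [21] should be consulted for the actual transfer function. The function names are hypothetical.

```python
ADC_BITS = 16      # resolution stated in the text
V_RANGE = 3.3      # full-scale signal range in volts
V_CENTER = 1.65    # center (baseline) voltage
GAIN = 750.0       # amplification in V/V


def code_to_volts(code):
    """Map an unsigned ADC code to volts (linear mapping is an assumption)."""
    return code * V_RANGE / (2 ** ADC_BITS - 1)


def volts_to_eog_microvolts(volts):
    """Remove the 1.65 V offset and undo the gain; result in microvolts."""
    return (volts - V_CENTER) / GAIN * 1e6
```

Under these assumptions, a full-scale reading of 3.3 V corresponds to an input-referred amplitude of about 2,200 microvolts above the baseline.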


Figure 3. PSL-iEOG2 and PSL-DAQ configuration


3.1.2. Software (PSL-DAQ RMSW) 

The software has two main screens: DAQ view and load view. The DAQ view screen monitors the
two-channel signal generated by the PSL-DAQ unit in real time. This requires communication between the
PSL-DAQ unit and the computer through the USB port. The data of both channels are saved simultaneously
in PDQ format.

On the other hand, load view is a screen for reviewing and printing the saved signal data. It offers
several display options, such as trace, center, auto, and full. It also supports the conversion of multiple files
from the PDQ extension to the txt extension to be compatible with other programs such as MATLAB. To
maintain synchronization between the horizontal and vertical eye signals, two PSL-iEOG2 units must be
combined. The first records the horizontal signal and the second records the vertical signal. The two units
are connected to the PSL-DAQ gender, which works as an electrical switch to control the transmission of
the horizontal and vertical signals, and the outputs are then digitized by PSL-DAQ, as shown in Figure 4.


Figure 4. Modular connection for synchronization


3.1.3. Dataset

Data were collected from 50 healthy subjects with normal vision (26 males and 24 females) ranging
in age from 20 to 55 years. The electrodes were placed around their eyes as follows: a pair of electrodes to
the right and left of the eye for the horizontal channel, another pair above and below the left eye for the
vertical channel, and a pair in the center of the forehead for reference. The signal was recorded while the
subject followed a set of commands in the following order: up, down, right, left, double blinking, and
center, each command representing a class of the dataset. The full recording lasts from 12 to 20 seconds,
with a gap of 1 second between consecutive commands. The recording was repeated ten times for each
subject to obtain ten different signals. Hence, the total number is 500 signals. Each signal has been
segmented into six signals, one for each of the six movement classes. Thus, 500 pairs of EOG signals have
been obtained for each movement. Each pair of EOG signals (horizontal and vertical) represents one system
input.
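
The dataset arithmetic above can be checked with a short sketch. It only enumerates (subject, repetition, movement) labels from the stated counts; the variable names are illustrative, and no signal data is involved.

```python
from itertools import product

SUBJECTS = 50        # healthy subjects
REPETITIONS = 10     # recordings per subject
MOVEMENTS = ["up", "down", "right", "left", "double blink", "center"]

# One full recording per (subject, repetition): 50 * 10 = 500 signals.
recordings = list(product(range(SUBJECTS), range(REPETITIONS)))

# Each recording is segmented into one pair of EOG signals per movement.
segments = [(s, r, m) for (s, r) in recordings for m in MOVEMENTS]

# Number of pairs available for each movement class.
per_class = {m: sum(1 for _, _, label in segments if label == m) for m in MOVEMENTS}
```

This gives 500 pairs per movement class and 3,000 pairs in total.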


3.2. Pre-processing 

After the signal acquisition phase, the preprocessing phase is carried out in two steps. In the first
step, the EOG signal is filtered using a second-order Butterworth band pass filter. The filter order has been
chosen empirically. The filter is applied with a pass band from 0.5 to 20 Hz, which is the bandwidth of EOG
signals. Figure 5(a) shows the EOG signal before filtering, and Figure 5(b) shows the waveform of the EOG
signal after filtering. The second step is down sampling the filtered EOG signals to obtain the best accuracy
with minimum computation. Hence, each signal is down sampled to 100 samples without losing any
significant information from the signal.
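
The two preprocessing steps can be sketched with SciPy as shown below. This is a minimal sketch under stated assumptions, not the authors' code: the zero-phase `filtfilt` call and the FFT-based `resample` call are choices made here for illustration (the text does not specify how filtering and down sampling were implemented), and the function names `preprocess` and `make_input` are hypothetical. The 1,000 Hz sampling rate is taken from section 3.1.1.

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample

FS = 1000  # Hz, PSL-DAQ sampling rate from section 3.1.1


def preprocess(signal, fs=FS, low=0.5, high=20.0, order=2, n_out=100):
    """Band pass filter one EOG channel and down sample it to n_out samples."""
    # Second-order Butterworth band pass design with a 0.5-20 Hz pass band.
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    # Assumption: zero-phase filtering (forward and backward pass).
    filtered = filtfilt(b, a, signal)
    # Assumption: FFT-based resampling to a fixed length of n_out samples.
    return resample(filtered, n_out)


def make_input(horizontal, vertical):
    """Concatenate the preprocessed horizontal and vertical channels."""
    return np.concatenate([preprocess(horizontal), preprocess(vertical)])
```

With the default settings, each pair of channels yields one input vector of 200 values.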


[Figure 5 panels: (a) "Input Signal" and (b) "Filtered Signal"; amplitude versus time, horizontal axis from 0 to 1600]


Figure 5. EOG signal (a) before filtering and (b) after filtering


3.3. Classification 

In the classification phase, the filtered signals are fed into a deep learning model to be classified
into one of the six eye movements: up, down, left, right, center, and double blinking. For comparison, three
deep learning models have been evaluated, namely the CNN, VGG, and inception models. Each model is
discussed briefly as follows:


3.3.1. Convolutional neural network 

CNN is considered a version of a multilayer perceptron network for supervised learning. It consists
of an input layer, multiple hidden layers, and an output layer. The input passes to the network through the
input layer, and the hidden layers consist of convolutional layers and pooling layers followed by fully
connected layers that are used as output layers [22].

The convolutional layer is the core of a CNN. In this layer, a mathematical operation called
convolution is applied: a filter, usually called a convolutional kernel, slides over the input and is combined
with it at each position to extract certain features or patterns from the original input. A CNN may include
more than one convolutional layer; the first hidden layers extract simple and clear features, and the
complexity of the features increases with depth. All values resulting from the convolution process are stored
in the feature map. Thereafter, the outputs are normally passed through an activation function such as the
rectified linear unit (ReLU). In the pooling layer, the size of the activation map is reduced. This reduces the
number of parameters and computations, which helps to limit overfitting. Pooling is carried out by applying
either a max or an average function. The most popular choice is max pooling with a small window, which
keeps the largest values and reduces the map size. The fully connected layer is the layer in which
classification takes place. A fully connected multi-layer perceptron with a softmax activation function is
utilized. The output is an array of values; each value represents the probability that the current input belongs
to one of the considered classes. The winning class is the one with the largest probability [23], [24].
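
The building blocks described above can be illustrated on plain Python lists. This is a didactic sketch for a single channel and a single filter, not an efficient or complete layer implementation; as in most deep learning libraries, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
import math


def conv1d(x, kernel):
    """Slide the kernel over x and take a dot product at each position."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x):
    """Rectified linear unit: negative values are set to zero."""
    return [max(0.0, v) for v in x]


def max_pool1d(x, size=2):
    """Keep the largest value in each non-overlapping window."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to one."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

The winning class is then the index of the largest softmax output.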

In this study, a one-dimensional CNN (1D CNN) is considered for the task at hand. In the proposed
1D CNN, the convolutional layer creates a filter that passes over the single temporal dimension of the EOG
signal to produce a tensor of outputs. Figure 6 shows the architecture of the proposed 1D CNN. The model
is built from four basic blocks. Each block consists of two convolutional layers with the ReLU activation
function to extract the main features and one max pooling layer to reduce the size of the feature map and
retain the most distinctive features. The four blocks differ in the number of filters in the convolutional
layers (32, 64, 215, and 1,024). Thereafter, a dropout layer with a rate of 50% is added to prevent the model
from overfitting at each update of the training phase. Finally, a flatten layer is added to pass the feature
vector to the output layer, which is a fully connected layer with six output classes (up, down, right, left,
double blink, and center). The softmax activation function is utilized to compute their probabilities, and the
signal is classified according to the maximum probability. The best results are achieved by training the
model for 200 epochs using the Adam optimizer with a learning rate equal to 0.0001.
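
To see how the feature map shrinks through the four blocks, the sketch below traces the output length of each block. The filter counts are those listed above; the kernel size of 3, the absence of padding, the pooling window of 2, and the input length of 200 (two concatenated 100-sample channels) are assumptions made for illustration, since the text does not state them for this model.

```python
def conv_out(length, kernel=3):
    """Output length of an unpadded 1D convolution (assumed kernel size)."""
    return length - kernel + 1


def pool_out(length, pool=2):
    """Output length of non-overlapping max pooling (assumed window)."""
    return length // pool


def trace_cnn(input_length=200, filters=(32, 64, 215, 1024)):
    """Return (length, channels) after each block of two convs and one pool."""
    length = input_length
    shapes = []
    for n_filters in filters:
        length = conv_out(conv_out(length))  # two convolutional layers
        length = pool_out(length)            # one max pooling layer
        shapes.append((length, n_filters))
    return shapes
```

Under these assumptions the blocks produce shapes (98, 32), (47, 64), (21, 215), and (8, 1024), so the flatten layer would pass 8 x 1,024 = 8,192 features to the output layer.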


Figure 6. The proposed 1D CNN design


3.3.2. Visual geometry group network 

The VGG network is a CNN model, widely available in pre-trained form, that was proposed by the
Visual Geometry Group of Oxford University [25]. It was one of the best-known models submitted to the
ImageNet large scale visual recognition challenge (ILSVRC2014), where it performed very well and
achieved a top-5 accuracy of 92.7% on the ImageNet dataset (14 million images with 1,000 classes) [25].

In this study, the input signal is passed to a stack of 1D convolutional layers that use small kernels
to capture the patterns in the signals. Figure 7 shows the VGG architecture utilized in this study. The
convolutional layers are arranged in five blocks, with 64, 128, 256, 512, and 512 filters, respectively.
Pooling is performed by five max-pooling layers, each of which follows a block of convolutional layers.
Finally, there are three fully connected layers of different widths, with dropout to reduce overfitting. In the
last fully connected layer, the softmax activation function is used to produce the probability of each of the
six classes (up, down, right, left, double blink, and center). The input signal is assigned to the class with the
largest probability. The best results have been achieved when the model is trained for 250 epochs using the
Adam optimizer with valid padding and a batch size of 64.


Figure 7. VGG network architecture


3.3.3. Inception network

The inception network [26] introduced significant innovations for deep learning models. There are
two versions of the inception module: the naive inception module and the inception module with dimension
reduction. The naive inception model (Figure 8(a)) [26] encompasses multiple blocks. Each block includes
multiple convolutional layers that work in parallel at the same level. Each layer consists of a number of
filters with a specific kernel size (1x1, 3x3, and 5x5). Finally, the concatenation of the outputs is reduced by
applying a max-pooling operation and sent to the next inception block.

In this study, the inception model with dimension reduction is applied to the task at hand, as shown
in Figure 8(b). 1x1 convolutional layers are applied before the 3x3 and 5x5 convolutional layers, and also
after the pooling layer. Thus, the dimensionality inside each inception module is reduced, which decreases
the computational cost and helps to avoid overfitting. In the last layer, the softmax activation function is
used to produce the probability of each of the six classes (up, down, right, left, double blink, and center).
The input signal is assigned to the class with the largest probability. The proposed architecture is based on
five sequential inception modules. For each module, the outputs are concatenated and sent to the next
inception module as inputs. The best results are obtained by training the model for 100 epochs using the
Adam optimizer with a learning rate equal to 0.0001 and a batch size of 32.
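
The saving obtained from the 1x1 reduction layers can be illustrated by counting convolution weights in a 1D setting (kernel lengths 1, 3, and 5). The channel counts used in the usage note are hypothetical and are not the values of the proposed architecture; biases are ignored.

```python
def conv_weights(in_ch, out_ch, kernel):
    """Number of weights in a 1D convolution (biases ignored)."""
    return in_ch * out_ch * kernel


def naive_module(in_ch, n1, n3, n5):
    """Parallel convolutions of length 1, 3, and 5 applied to the input."""
    return (conv_weights(in_ch, n1, 1)
            + conv_weights(in_ch, n3, 3)
            + conv_weights(in_ch, n5, 5))


def reduced_module(in_ch, n1, r3, n3, r5, n5, pool_proj):
    """Same branches, with length-1 reductions before 3 and 5 and after pooling."""
    return (conv_weights(in_ch, n1, 1)
            + conv_weights(in_ch, r3, 1) + conv_weights(r3, n3, 3)
            + conv_weights(in_ch, r5, 1) + conv_weights(r5, n5, 5)
            + conv_weights(in_ch, pool_proj, 1))
```

For example, with 256 input channels and hypothetical branch widths n1 = 64, n3 = 128, and n5 = 32, the naive module needs 155,648 weights, whereas the reduced module with r3 = 64, r5 = 16, and a 32-channel pooling projection needs 72,192.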


Figure 8. Inception model version (a) inception model naive version and (b) inception module with 
dimension reduction 


3.4. User interface 

The graphical user interface has been designed so that paralyzed users can write messages and
texts on the computer using their eye movements in an easy and completely non-stressful way. The design
takes into consideration all facilities that enable disabled users to communicate effectively with their
community in all life situations, so that they can live better and freely communicate their words and ideas.

The main window has four actions (Figure 9(a)): i) the user can write data and messages by looking
to the left and selecting “writing”, which opens the virtual keyboard window (Figure 9(b)); ii) the user can
select from daily activities by looking to the right and choosing “daily activities”, which opens another
window (Figure 9(c)) where common daily activities are listed so that the user can choose among them, also
by eye movements; iii) the user can see the latest news loaded from different websites by looking up in the
main window and choosing “see latest news” (Figure 9(d)); iv) the user can terminate the application by
looking down in the main window and choosing “exit”. Choosing any option or character is done by double
blinking.
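
The main-window logic can be sketched as a small dispatcher over the stream of classified movements. The mapping of directions to options follows the description above; the function name `select_action` and the rule that the most recently highlighted option is the one confirmed are assumptions for illustration.

```python
# Direction-to-option mapping of the main window, as described in the text.
MAIN_MENU = {
    "left": "writing",
    "right": "daily activities",
    "up": "see latest news",
    "down": "exit",
}


def select_action(movements):
    """Return the option confirmed by a double blink, or None.

    `movements` is a sequence of classifier outputs such as
    "left", "center", or "double blink".
    """
    highlighted = None
    for movement in movements:
        if movement in MAIN_MENU:
            highlighted = MAIN_MENU[movement]      # look toward an option
        elif movement == "double blink" and highlighted is not None:
            return highlighted                     # confirm the choice
    return None
```

Movements that are not directions, such as "center", leave the highlighted option unchanged.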

In the writing window (Figure 9(b)), the characters and numbers are divided into four groups (up,
down, right, and left), consistent with the basic eye movements. Initially, the cursor is on the button in the
middle of the window. To write a particular letter, the user looks in the direction of the group where the
letter is located and selects it with a double blink. After selecting the group, the user moves the cursor
through eye movements in the direction of the character within that group and selects it. Each chosen
character is written in the text box. To speed up the process and avoid straining the user's eyes, an
auto-completion feature based on a dictionary is added to the application. The user selects only the initial
letters of the word, and a list of words matching these letters appears on the left of the window so that the
user can choose the desired word. Moreover, if the user chooses the “speak” button, everything written in
the text box is pronounced.
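
One simple way to realize such dictionary-based auto-completion is a prefix search over a sorted word list, sketched below. The text does not describe how the feature is implemented, so the binary-search approach, the function name `suggest`, and the limit on the number of suggestions are all assumptions.

```python
from bisect import bisect_left


def suggest(prefix, sorted_words, limit=5):
    """Return up to `limit` dictionary words that start with `prefix`.

    `sorted_words` must be sorted in ascending order.
    """
    # Binary search for the first word that is not smaller than the prefix.
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        # Words sharing the prefix are contiguous in a sorted list.
        if not word.startswith(prefix) or len(matches) == limit:
            break
        matches.append(word)
    return matches
```

The returned list would then be shown to the user, who picks the desired word by eye movements.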


Figure 9. The user interface system (a) system main window, (b) writing window, (c) daily activities window,
and (d) last 24-hour news window


4. EXPERIMENTAL RESULTS 

The experiments have been conducted using the dataset described in section 3.1. Five-fold cross
validation is used for evaluation. The dataset includes six categories: left, right, up, down, center, and
blinking. The horizontal and vertical EOG signals are preprocessed as discussed in section 3.2. The three
considered models (CNN, VGG, and inception) have been examined and evaluated using four
measurements: sensitivity, specificity, precision, and overall accuracy. These criteria are computed from the
confusion matrix of each deep learning model as in (1)-(4):


Sensitivity (Recall) = TP / (TP + FN)                     (1)
Specificity = TN / (TN + FP)                              (2)
Precision = TP / (TP + FP)                                (3)
Overall Accuracy = (TP + TN) / (TP + TN + FP + FN)        (4)


where TP is true positive, TN is true negative, FP is false positive, and FN is false negative. Tables 1-3
show the achieved results. Figure 10 summarizes the results, which reveal the superiority of the proposed
inception model. Moreover, Table 4 provides a comparison with previous studies, which reveals the
superiority of the proposed models. In addition, the proposed system needs 1 second to perform one
complete selection.
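
Equations (1)-(4) can be evaluated per class from a multi-class confusion matrix by treating each class in turn as the positive class. The sketch below assumes that rows hold the true classes and columns the predicted classes; the small matrix in the usage note is hypothetical and is not taken from the experiments.

```python
def counts_for_class(confusion, k):
    """Return (TP, FN, FP, TN) for class k, with rows = true, columns = predicted."""
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                    # class k predicted as another class
    fp = sum(row[k] for row in confusion) - tp     # other classes predicted as k
    tn = sum(map(sum, confusion)) - tp - fn - fp   # everything else
    return tp, fn, fp, tn


def one_vs_rest_metrics(tp, fn, fp, tn):
    """Evaluate equations (1)-(4) for one class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "overall_accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```

For a hypothetical three-class matrix [[8, 1, 1], [2, 7, 1], [0, 1, 9]], class 0 has TP = 8, FN = 2, FP = 2, and TN = 18, which gives a sensitivity of 0.8, a specificity of 0.9, a precision of 0.8, and an overall accuracy of 26/30.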


Table 1. The achieved accuracies using CNN 


Movement classes               Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   Overall accuracy (%)
Down                           94.4           94                99.2              95.9            98.3
Up                             95.4           95                99                95              98.3
Left                           97.2           97                99                95.09           98.6
Right                          97.2           97                99                95.09           98.6
Blink                          93.6           94                99.2              95.9            98.3
Center                         96.6           97                99.4              97              99
Average ± standard deviation   95.7 ± 1.51    95.6 ± 1.5        99.1 ± 0.16       95.6 ± 0.77     98.5 ± 0.28


Table 2. The achieved accuracies using VGG network


Movement classes               Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   Overall accuracy (%)
Down                           90.8           91                98.4              91.9            97.1
Up                             90.6           91                98                90.09           96.8
Left                           96             96                98.4              92.3            98
Right                          94.4           94                98.4              92.1            97.6
Blink                          89.2           89                99                94.6            97.3
Center                         93             93                98.6              93              97.6
Average ± standard deviation   92.3 ± 2.58    92.3 ± 2.5        98.5 ± 0.33       92.3 ± 1.47     97.4 ± 0.42


Table 3. The achieved accuracies using inception network 


Movement classes               Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   Overall accuracy (%)
Down                           97             97                99.6              97.9            99.1
Up                             96.6           97                99.2              96.03           98.8
Left                           97.8           98                99.4              97.02           99.1
Right                          97.4           97                99.2              96.03           98.8
Blink                          93.8           94                99.2              95.9            98.3
Center                         95.8           96                99.2              96              98.6
Average ± standard deviation   96.4 ± 1.45    96.5 ± 1.38       99.3 ± 0.17       96.5 ± 0.81     98.8 ± 0.31
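
The summary row of Table 3 can be reproduced from the per-class values with the Python standard library. The sketch below uses the accuracy column of the inception model; the use of the sample standard deviation (dividing by n - 1) is an inference from the reported figures rather than something stated in the text.

```python
from statistics import mean, stdev

# Per-class accuracy (%) of the inception model, copied from Table 3.
inception_accuracy = [97, 96.6, 97.8, 97.4, 93.8, 95.8]

average = mean(inception_accuracy)    # arithmetic mean over the six classes
spread = stdev(inception_accuracy)    # sample standard deviation (n - 1)

summary = f"{average:.1f} ± {spread:.2f}"
```

The resulting summary string is "96.4 ± 1.45", which matches the table.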


Figure 10. Summarized results for the three deep learning models


Table 4. Comparison between the proposed work and previous studies 


Study                    Dataset       Preprocessing       Feature extraction                     Classification         Accuracy (%)
Zhang et al. [14]        8 subjects    Wavelet filtering   L1-norm, kurtosis and entropy          SVM                    84.4
Ramkumar et al. [27]     10 subjects   Band pass filter    Parseval theorems                      TDNN                   89.6
                                                                                                  FFNN                   94.1
Banerjee et al. [28]     4 subjects    Low pass filter,    Wavelet coefficients, autoregressive   KNN                    80
                                       high pass filter    and power spectral density
Usakli and Gurkan [29]   20 subjects   Band pass filter    -                                      Nearest neighborhood   95
Proposed method          50 subjects   Band pass filter    -                                      CNN                    95.7
                                                                                                  VGG                    92.3
                                                                                                  Inception              96.4


5. CONCLUSION 

This paper proposes a writing system controlled by EOG signals. The system is intended to help
patients who suffer from diseases that cause severe motion disabilities and paralyze their limbs, leaving
them with nothing to communicate with their community except the movement of their eyes. The proposed
system is based on the detection and recognition of six categories of eye movements (up, down, right, left,
center, and blinking) captured by EOG signals. These signals are filtered using a second-order Butterworth
band pass filter; then the vertical and horizontal EOG signals are concatenated into one vector that forms
the input to three deep learning models. The three models investigated in this study are CNN, VGG, and
inception. The results reveal the superiority of the inception model, which achieved an average accuracy of
96.4%, over the other two considered models and the existing studies. In addition, the processing time is
one second per selection, which is comparable with the existing studies. Moreover, a user interface is
proposed in this study. It consists of four windows: the first is the main window, which displays all the
provided features; the second is a virtual keyboard for writing texts; the third includes the daily activities
that patients may need; and the last displays up-to-date news gathered from well-known news sites. Finally,
we look forward to improving the processing time and examining other state-of-the-art deep learning
models.